Overview

Dataset statistics

Number of variables: 10
Number of observations: 5570
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 435.3 KiB
Average record size in memory: 80.0 B
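
As a quick consistency check, the average record size follows from the total memory and the number of observations. The sketch below redoes that arithmetic with the figures reported above. Reading the result as ten 8-byte cells per row is an assumption about how the columns are stored, not something the report states.

```python
# Figures copied from the dataset statistics above.
total_memory_kib = 435.3
n_observations = 5570
n_variables = 10

# Average bytes per row: total bytes divided by the number of rows.
avg_record_bytes = total_memory_kib * 1024 / n_observations

# Assumption: every column stores one fixed-width cell per row.
bytes_per_cell = avg_record_bytes / n_variables

print(f"{avg_record_bytes:.1f} B per record, {bytes_per_cell:.1f} B per cell")
```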

Variable types

Categorical: 1
Numeric: 9

Alerts

MUNICIPIO has a high cardinality: 5297 distinct values [High cardinality]
POP_EST is highly correlated with NRO_EMP and 1 other field [High correlation]
IDHM is highly correlated with PIBCAP [High correlation]
PIBCAP is highly correlated with IDHM [High correlation]
NRO_EMP is highly correlated with POP_EST [High correlation]
MASSA_PCAP is highly correlated with MASSA_PCAP_POP [High correlation]
MASSA_PCAP_POP is highly correlated with MASSA_PCAP [High correlation]
DESP_TOT_RSU is highly correlated with POP_EST [High correlation]
POP_EST is highly correlated with NRO_EMP and 1 other field [High correlation]
NRO_EMP is highly correlated with POP_EST and 1 other field [High correlation]
MASSA_PCAP is highly correlated with MASSA_PCAP_POP [High correlation]
MASSA_PCAP_POP is highly correlated with MASSA_PCAP [High correlation]
DESP_TOT_RSU is highly correlated with POP_EST and 1 other field [High correlation]
IDHM is highly correlated with PIBCAP [High correlation]
PIBCAP is highly correlated with IDHM [High correlation]
MASSA_PCAP is highly correlated with MASSA_PCAP_POP [High correlation]
MASSA_PCAP_POP is highly correlated with MASSA_PCAP [High correlation]
POP_EST is highly correlated with DENS_DEM and 2 other fields [High correlation]
DENS_DEM is highly correlated with POP_EST and 2 other fields [High correlation]
NRO_EMP is highly correlated with POP_EST and 2 other fields [High correlation]
MASSA_PCAP is highly correlated with MASSA_PCAP_POP [High correlation]
MASSA_PCAP_POP is highly correlated with MASSA_PCAP [High correlation]
DESP_TOT_RSU is highly correlated with POP_EST and 2 other fields [High correlation]
POP_EST is highly skewed (γ1 = 37.10739116) [Skewed]
NRO_EMP is highly skewed (γ1 = 34.98315526) [Skewed]
DESP_TOT_RSU is highly skewed (γ1 = 41.42497498) [Skewed]
MUNICIPIO is uniformly distributed [Uniform]
NRO_EMP has 3558 (63.9%) zeros [Zeros]
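
The skewness alerts report γ1, the standardized third moment of each variable. The sketch below shows one common way to estimate it, the bias-corrected Fisher-Pearson coefficient. It is an assumption that the report uses this exact estimator, and `sample_skewness` is a hypothetical helper name.

```python
import math


def sample_skewness(values):
    """Bias-corrected Fisher-Pearson skewness of a sample (needs n >= 3)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least three observations")
    mean = sum(values) / n
    # Second and third central moments (population form).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0  # constant data has no asymmetry
    g1 = m3 / m2 ** 1.5
    # Small-sample correction factor.
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


symmetric = sample_skewness([1, 2, 3, 4, 5])
right_tailed = sample_skewness([1, 1, 1, 2, 50])
print(symmetric, right_tailed)
```

A symmetric sample gives a value of zero, while a single very large observation pulls the coefficient strongly positive, which is the pattern flagged for POP_EST, NRO_EMP and DESP_TOT_RSU.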

Reproduction

Analysis started: 2022-09-10 01:07:23.844761
Analysis finished: 2022-09-10 01:08:09.368260
Duration: 45.52 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

MUNICIPIO
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 5297
Distinct (%): 95.1%
Missing: 0
Missing (%): 0.0%
Memory size: 43.6 KiB

Value | Count
Bom Jesus | 5
São Domingos | 5
São Francisco | 4
Planalto | 4
Santa Helena | 4
Other values (5292) | 5548

Length

Max length: 32
Median length: 27
Mean length: 11.60771993
Min length: 3

Characters and Unicode

Total characters: 64655
Distinct characters: 71
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
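
A minimal sketch of how such per-category character counts can be derived with Python's standard `unicodedata` module, using the five sample names listed in this report. The helper name `category_counts` is hypothetical, and the two-letter codes it returns (`Ll`, `Lu`, `Zs`) correspond to the long names used here (Lowercase Letter, Uppercase Letter, Space Separator).

```python
import unicodedata
from collections import Counter

# First five MUNICIPIO values shown in the Sample section of this report.
names = [
    "Abadia de Goiás",
    "Abadia dos Dourados",
    "Abadiânia",
    "Abaeté",
    "Abaetetuba",
]


def category_counts(strings):
    """Count characters per Unicode general category (e.g. 'Ll', 'Lu', 'Zs')."""
    counts = Counter()
    for text in strings:
        # Normalize so accented letters are single precomposed code points.
        for ch in unicodedata.normalize("NFC", text):
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(names)
print(counts.most_common())
```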

Unique

Unique: 5065
Unique (%): 90.9%

Sample

1st row: Abadia de Goiás
2nd row: Abadia dos Dourados
3rd row: Abadiânia
4th row: Abaeté
5th row: Abaetetuba

Common Values

Value | Count | Frequency (%)
Bom Jesus | 5 | 0.1%
São Domingos | 5 | 0.1%
São Francisco | 4 | 0.1%
Planalto | 4 | 0.1%
Santa Helena | 4 | 0.1%
Bonito | 4 | 0.1%
Santa Terezinha | 4 | 0.1%
Vera Cruz | 4 | 0.1%
Santa Inês | 4 | 0.1%
Santa Luzia | 4 | 0.1%
Other values (5287) | 5528 | 99.2%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
do | 757 | 7.4%
são | 364 | 3.5%
de | 300 | 2.9%
santa | 161 | 1.6%
da | 143 | 1.4%
nova | 135 | 1.3%
sul | 115 | 1.1%
rio | 94 | 0.9%
dos | 73 | 0.7%
josé | 70 | 0.7%
Other values (3959) | 8071 | 78.5%

Most occurring characters

Value | Count | Frequency (%)
a | 8789 | 13.6%
o | 5959 | 9.2%
(space) | 4713 | 7.3%
r | 4531 | 7.0%
i | 4388 | 6.8%
e | 3758 | 5.8%
n | 3197 | 4.9%
d | 2553 | 3.9%
s | 2419 | 3.7%
t | 2291 | 3.5%
Other values (61) | 22057 | 34.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 50859 | 78.7%
Uppercase Letter | 9009 | 13.9%
Space Separator | 4713 | 7.3%
Other Punctuation | 47 | 0.1%
Dash Punctuation | 27 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 8789 | 17.3%
o | 5959 | 11.7%
r | 4531 | 8.9%
i | 4388 | 8.6%
e | 3758 | 7.4%
n | 3197 | 6.3%
d | 2553 | 5.0%
s | 2419 | 4.8%
t | 2291 | 4.5%
u | 2154 | 4.2%
Other values (28) | 10820 | 21.3%

Uppercase Letter
Value | Count | Frequency (%)
S | 1136 | 12.6%
C | 971 | 10.8%
P | 911 | 10.1%
M | 721 | 8.0%
A | 697 | 7.7%
B | 602 | 6.7%
I | 475 | 5.3%
J | 405 | 4.5%
G | 392 | 4.4%
R | 367 | 4.1%
Other values (20) | 2332 | 25.9%

Space Separator
Value | Count | Frequency (%)
(space) | 4713 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
' | 47 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 27 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 59868 | 92.6%
Common | 4787 | 7.4%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 8789 | 14.7%
o | 5959 | 10.0%
r | 4531 | 7.6%
i | 4388 | 7.3%
e | 3758 | 6.3%
n | 3197 | 5.3%
d | 2553 | 4.3%
s | 2419 | 4.0%
t | 2291 | 3.8%
u | 2154 | 3.6%
Other values (58) | 19829 | 33.1%

Common
Value | Count | Frequency (%)
(space) | 4713 | 98.5%
' | 47 | 1.0%
- | 27 | 0.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 61812 | 95.6%
None | 2843 | 4.4%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 8789 | 14.2%
o | 5959 | 9.6%
(space) | 4713 | 7.6%
r | 4531 | 7.3%
i | 4388 | 7.1%
e | 3758 | 6.1%
n | 3197 | 5.2%
d | 2553 | 4.1%
s | 2419 | 3.9%
t | 2291 | 3.7%
Other values (44) | 19214 | 31.1%

None
Value | Count | Frequency (%)
ã | 794 | 27.9%
á | 395 | 13.9%
í | 335 | 11.8%
é | 319 | 11.2%
ç | 268 | 9.4%
ó | 243 | 8.5%
â | 161 | 5.7%
ú | 100 | 3.5%
ô | 71 | 2.5%
ê | 71 | 2.5%
Other values (7) | 86 | 3.0%

POP_EST
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 5110
Distinct (%): 91.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38297.60126
Minimum: 771
Maximum: 12396372
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 43.6 KiB

Quantile statistics

Minimum: 771
5-th percentile: 2476.45
Q1: 5454
Median: 11732
Q3: 25764.75
95-th percentile: 116227.95
Maximum: 12396372
Range: 12395601
Interquartile range (IQR): 20310.75

Descriptive statistics

Standard deviation: 224288.1528
Coefficient of variation (CV): 5.856454333
Kurtosis: 1822.798199
Mean: 38297.60126
Median Absolute Deviation (MAD): 7558.5
Skewness: 37.10739116
Sum: 213317639
Variance: 5.030517549 × 10^10
Monotonicity: Not monotonic
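
Several of the POP_EST figures above are derived from others in the same list. The sketch below recomputes the range, the IQR, the coefficient of variation and the variance from the reported minimum, maximum, quartiles, mean and standard deviation. The variable names are illustrative.

```python
# Values copied from the POP_EST statistics above.
minimum, maximum = 771, 12396372
q1, q3 = 5454, 25764.75
mean, std = 38297.60126, 224288.1528

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle 50% of values
cv = std / mean                   # dispersion relative to the mean
variance = std ** 2               # square of the standard deviation

print(value_range, iqr, round(cv, 6), f"{variance:.6e}")
```

A CV near 5.86 together with an IQR of about 20 thousand against a range of over 12 million indicates that a small number of very large municipalities account for most of the spread.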
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
6232 | 4 | 0.1%
2939 | 3 | 0.1%
3861 | 3 | 0.1%
4911 | 3 | 0.1%
5447 | 3 | 0.1%
5646 | 3 | 0.1%
6115 | 3 | 0.1%
16158 | 3 | 0.1%
14415 | 3 | 0.1%
3824 | 3 | 0.1%
Other values (5100) | 5539 | 99.4%

Minimum 10 values

Value | Count | Frequency (%)
771 | 1 | < 0.1%
839 | 1 | < 0.1%
909 | 1 | < 0.1%
932 | 1 | < 0.1%
1084 | 1 | < 0.1%
1124 | 1 | < 0.1%
1142 | 1 | < 0.1%
1150 | 1 | < 0.1%
1171 | 1 | < 0.1%
1211 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
12396372 | 1 | < 0.1%
6775561 | 1 | < 0.1%
3094325 | 1 | < 0.1%
2900319 | 1 | < 0.1%
2703391 | 1 | < 0.1%
2530701 | 1 | < 0.1%
2255903 | 1 | < 0.1%
1963726 | 1 | < 0.1%
1661017 | 1 | < 0.1%
1555626 | 1 | < 0.1%

DENS_DEM
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 4033
Distinct (%): 72.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 108.2024892
Minimum: 0.13
Maximum: 13024.56
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 43.6 KiB

Quantile statistics

Minimum: 0.13
5-th percentile: 2.279
Q1: 11.57
Median: 24.4
Q3: 51.835
95-th percentile: 249.5835
Maximum: 13024.56
Range: 13024.43
Interquartile range (IQR): 40.265

Descriptive statistics

Standard deviation: 571.8601176
Coefficient of variation (CV): 5.285092068
Kurtosis: 226.818646
Mean: 108.2024892
Median Absolute Deviation (MAD): 15.995
Skewness: 13.59588765
Sum: 602687.865
Variance: 327023.9941
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
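
A fixed-size-bin histogram splits the interval between the minimum and maximum into equally wide bins and counts the values falling into each. The sketch below is a minimal pure-Python version of that idea, not the plotting code used by the report. `fixed_width_histogram` is a hypothetical helper, and the input values are illustrative.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equally wide intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if lo == hi:
        counts[0] = len(values)  # all values identical: single occupied bin
        return counts
    width = (hi - lo) / bins
    for v in values:
        # The maximum would land on index `bins`, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


# Illustrative right-skewed values; most fall into the first bin.
sample = [46.85, 7.61, 15.08, 12.49, 87.61, 58.69, 15.68, 11.49, 33.95, 11.25]
counts = fixed_width_histogram(sample, bins=5)
print(counts)
```

With strongly skewed data, equal-width bins concentrate almost all observations in the lowest bins, which is why a log scale or quantile-based bins are often easier to read for variables such as this one.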
Common values

Value | Count | Frequency (%)
1.31 | 6 | 0.1%
19.32 | 6 | 0.1%
9.95 | 6 | 0.1%
2.79 | 5 | 0.1%
12.57 | 5 | 0.1%
11.06 | 5 | 0.1%
12.79 | 5 | 0.1%
4.67 | 5 | 0.1%
9.17 | 5 | 0.1%
4.03 | 5 | 0.1%
Other values (4023) | 5517 | 99.0%

Minimum 10 values

Value | Count | Frequency (%)
0.13 | 1 | < 0.1%
0.2 | 1 | < 0.1%
0.21 | 2 | < 0.1%
0.23 | 1 | < 0.1%
0.26 | 2 | < 0.1%
0.28 | 1 | < 0.1%
0.29 | 1 | < 0.1%
0.32 | 1 | < 0.1%
0.33 | 3 | 0.1%
0.34 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
13024.56 | 1 | < 0.1%
12536.99 | 1 | < 0.1%
11994.31 | 1 | < 0.1%
10698.32 | 1 | < 0.1%
10264.8 | 1 | < 0.1%
9736.03 | 1 | < 0.1%
9063.58 | 1 | < 0.1%
8117.62 | 1 | < 0.1%
7786.44 | 1 | < 0.1%
7398.26 | 1 | < 0.1%

IDHM
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 349
Distinct (%): 6.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6591572711
Minimum: 0.418
Maximum: 0.862
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 43.6 KiB

Quantile statistics

Minimum: 0.418
5-th percentile: 0.544
Q1: 0.599
Median: 0.665
Q3: 0.718
95-th percentile: 0.766
Maximum: 0.862
Range: 0.444
Interquartile range (IQR): 0.119

Descriptive statistics

Standard deviation: 0.07196495405
Coefficient of variation (CV): 0.1091772134
Kurtosis: -0.8425533263
Mean: 0.6591572711
Median Absolute Deviation (MAD): 0.058
Skewness: -0.1556693989
Sum: 3671.506
Variance: 0.005178954611
Monotonicity: Not monotonic
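
The Median Absolute Deviation is a robust spread measure: the median of the absolute distances of each value from the median. The sketch below assumes that unscaled definition (no normal-consistency constant), which may differ from other tools. The input values are illustrative, not the full IDHM column.

```python
from statistics import median


def median_absolute_deviation(values):
    """Median of |x - median(values)|, without any scaling constant."""
    center = median(values)
    return median(abs(v - center) for v in values)


# Illustrative index values on the same 0..1 scale as IDHM.
sample = [0.708, 0.689, 0.689, 0.698, 0.628]
mad = median_absolute_deviation(sample)
print(round(mad, 3))
```

Unlike the standard deviation, this statistic is barely affected by a single extreme value, because only the middle of the sorted deviations matters.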
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0.71 | 43 | 0.8%
0.592 | 41 | 0.7%
0.701 | 38 | 0.7%
0.725 | 37 | 0.7%
0.718 | 36 | 0.6%
0.706 | 36 | 0.6%
0.699 | 35 | 0.6%
0.704 | 35 | 0.6%
0.721 | 35 | 0.6%
0.697 | 33 | 0.6%
Other values (339) | 5201 | 93.4%

Minimum 10 values

Value | Count | Frequency (%)
0.418 | 1 | < 0.1%
0.443 | 1 | < 0.1%
0.45 | 1 | < 0.1%
0.452 | 1 | < 0.1%
0.453 | 2 | < 0.1%
0.469 | 1 | < 0.1%
0.471 | 1 | < 0.1%
0.473 | 1 | < 0.1%
0.477 | 1 | < 0.1%
0.479 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
0.862 | 1 | < 0.1%
0.854 | 1 | < 0.1%
0.847 | 1 | < 0.1%
0.845 | 2 | < 0.1%
0.84 | 1 | < 0.1%
0.837 | 1 | < 0.1%
0.827 | 1 | < 0.1%
0.824 | 1 | < 0.1%
0.823 | 1 | < 0.1%
0.822 | 1 | < 0.1%

PIBCAP
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 5567
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23513.94173
Minimum: 4788.18
Maximum: 583171.85
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 43.6 KiB

Quantile statistics

Minimum: 4788.18
5-th percentile: 6963.446
Q1: 9880.37
Median: 17433.84
Q3: 28729.9075
95-th percentile: 57861.862
Maximum: 583171.85
Range: 578383.67
Interquartile range (IQR): 18849.5375

Descriptive statistics

Standard deviation: 24238.46308
Coefficient of variation (CV): 1.030812416
Kurtosis: 95.38033139
Mean: 23513.94173
Median Absolute Deviation (MAD): 8398.52
Skewness: 6.893354925
Sum: 130972655.4
Variance: 587503092.5
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
9973.26 | 2 | < 0.1%
9572.42 | 2 | < 0.1%
22953.42 | 2 | < 0.1%
26505.89 | 1 | < 0.1%
26471.59 | 1 | < 0.1%
11672.75 | 1 | < 0.1%
8202.59 | 1 | < 0.1%
5636.83 | 1 | < 0.1%
39844.35 | 1 | < 0.1%
31865.01 | 1 | < 0.1%
Other values (5557) | 5557 | 99.8%

Minimum 10 values

Value | Count | Frequency (%)
4788.18 | 1 | < 0.1%
4901.07 | 1 | < 0.1%
4903.02 | 1 | < 0.1%
4970.45 | 1 | < 0.1%
5062.94 | 1 | < 0.1%
5064.6 | 1 | < 0.1%
5079.92 | 1 | < 0.1%
5200.11 | 1 | < 0.1%
5263.41 | 1 | < 0.1%
5309.7 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
583171.85 | 1 | < 0.1%
419457.22 | 1 | < 0.1%
362080.4 | 1 | < 0.1%
337288.81 | 1 | < 0.1%
306163.17 | 1 | < 0.1%
304208.49 | 1 | < 0.1%
291967.12 | 1 | < 0.1%
268459.18 | 1 | < 0.1%
229610.7 | 1 | < 0.1%
225290.31 | 1 | < 0.1%

NRO_EMP
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED
ZEROS

Distinct: 59
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.69780754
Minimum: 0
Maximum: 532
Zeros: 3558
Zeros (%): 63.9%
Negative: 0
Negative (%): 0.0%
Memory size: 43.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 7
Maximum: 532
Range: 532
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 9.483517312
Coefficient of variation (CV): 5.585743429
Kurtosis: 1798.315999
Mean: 1.69780754
Median Absolute Deviation (MAD): 0
Skewness: 34.98315526
Sum: 9456.788
Variance: 89.9371006
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0 | 3558 | 63.9%
1 | 944 | 16.9%
2 | 331 | 5.9%
3 | 187 | 3.4%
4 | 117 | 2.1%
5 | 86 | 1.5%
6 | 53 | 1.0%
7 | 36 | 0.6%
8 | 27 | 0.5%
9 | 25 | 0.4%
Other values (49) | 206 | 3.7%

Minimum 10 values

Value | Count | Frequency (%)
0 | 3558 | 63.9%
1 | 944 | 16.9%
1.697 | 4 | 0.1%
2 | 331 | 5.9%
3 | 187 | 3.4%
4 | 117 | 2.1%
5 | 86 | 1.5%
6 | 53 | 1.0%
7 | 36 | 0.6%
8 | 27 | 0.5%

Maximum 10 values

Value | Count | Frequency (%)
532 | 1 | < 0.1%
175 | 1 | < 0.1%
129 | 1 | < 0.1%
121 | 1 | < 0.1%
114 | 1 | < 0.1%
103 | 1 | < 0.1%
99 | 1 | < 0.1%
90 | 1 | < 0.1%
71 | 1 | < 0.1%
68 | 1 | < 0.1%

MASSA_PCAP
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 293
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.016433752
Minimum: 0.03
Maximum: 5.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 43.6 KiB

Quantile statistics

Minimum: 0.03
5-th percentile: 0.32
Q1: 0.66
Median: 0.99
Q3: 1.15
95-th percentile: 2.11
Maximum: 5.9
Range: 5.87
Interquartile range (IQR): 0.49

Descriptive statistics

Standard deviation: 0.5526934418
Coefficient of variation (CV): 0.5437574663
Kurtosis: 4.356521646
Mean: 1.016433752
Median Absolute Deviation (MAD): 0.28
Skewness: 1.585750541
Sum: 5661.536
Variance: 0.3054700406
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
1.016 | 981 | 17.6%
0.67 | 60 | 1.1%
0.69 | 58 | 1.0%
0.73 | 56 | 1.0%
0.7 | 55 | 1.0%
0.64 | 54 | 1.0%
0.66 | 54 | 1.0%
0.61 | 53 | 1.0%
0.78 | 53 | 1.0%
0.63 | 53 | 1.0%
Other values (283) | 4093 | 73.5%

Minimum 10 values

Value | Count | Frequency (%)
0.03 | 1 | < 0.1%
0.1 | 19 | 0.3%
0.11 | 11 | 0.2%
0.12 | 11 | 0.2%
0.13 | 7 | 0.1%
0.14 | 9 | 0.2%
0.15 | 4 | 0.1%
0.16 | 8 | 0.1%
0.17 | 2 | < 0.1%
0.18 | 4 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
5.9 | 1 | < 0.1%
5.47 | 1 | < 0.1%
4.43 | 1 | < 0.1%
4 | 1 | < 0.1%
3.88 | 2 | < 0.1%
3.74 | 1 | < 0.1%
3.02 | 1 | < 0.1%
3 | 17 | 0.3%
2.99 | 9 | 0.2%
2.98 | 2 | < 0.1%

CUSTO_UNIM
Real number (ℝ≥0)

Distinct: 3071
Distinct (%): 55.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 223.671177
Minimum: 10
Maximum: 3912.29
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 43.6 KiB

Quantile statistics

Minimum: 10
5-th percentile: 50
Q1: 160.4225
Median: 223.671
Q3: 223.671
95-th percentile: 443.5365
Maximum: 3912.29
Range: 3902.29
Interquartile range (IQR): 63.2485

Descriptive statistics

Standard deviation: 132.7104988
Coefficient of variation (CV): 0.5933285665
Kurtosis: 123.5232457
Mean: 223.671177
Median Absolute Deviation (MAD): 38.175
Skewness: 6.010466398
Sum: 1245848.456
Variance: 17612.0765
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
223.671 | 2266 | 40.7%
500 | 12 | 0.2%
440 | 12 | 0.2%
50 | 10 | 0.2%
100 | 6 | 0.1%
200 | 6 | 0.1%
66.67 | 4 | 0.1%
312.5 | 4 | 0.1%
333.33 | 4 | 0.1%
250 | 4 | 0.1%
Other values (3061) | 3242 | 58.2%

Minimum 10 values

Value | Count | Frequency (%)
10 | 1 | < 0.1%
10.56 | 1 | < 0.1%
10.63 | 1 | < 0.1%
11.6 | 1 | < 0.1%
11.76 | 1 | < 0.1%
11.79 | 1 | < 0.1%
12.5 | 2 | < 0.1%
12.72 | 1 | < 0.1%
14.06 | 1 | < 0.1%
14.17 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
3912.29 | 1 | < 0.1%
2025.46 | 1 | < 0.1%
1700 | 1 | < 0.1%
1648.62 | 1 | < 0.1%
1389.88 | 1 | < 0.1%
1243.89 | 1 | < 0.1%
1227.69 | 1 | < 0.1%
1210.31 | 1 | < 0.1%
1170.34 | 1 | < 0.1%
1139.93 | 1 | < 0.1%

MASSA_PCAP_POP
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 297
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.8823504488
Minimum: 0.03
Maximum: 6.62
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 43.6 KiB

Quantile statistics

Minimum: 0.03
5-th percentile: 0.26
Q1: 0.56
Median: 0.86
Q3: 0.99
95-th percentile: 1.91
Maximum: 6.62
Range: 6.59
Interquartile range (IQR): 0.43

Descriptive statistics

Standard deviation: 0.5190683693
Coefficient of variation (CV): 0.5882791469
Kurtosis: 10.98648258
Mean: 0.8823504488
Median Absolute Deviation (MAD): 0.24
Skewness: 2.307497967
Sum: 4914.692
Variance: 0.269431972
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0.882 | 981 | 17.6%
0.65 | 65 | 1.2%
0.68 | 61 | 1.1%
0.56 | 59 | 1.1%
0.64 | 58 | 1.0%
0.6 | 57 | 1.0%
0.47 | 57 | 1.0%
0.48 | 57 | 1.0%
0.66 | 56 | 1.0%
0.69 | 55 | 1.0%
Other values (287) | 4064 | 73.0%

Minimum 10 values

Value | Count | Frequency (%)
0.03 | 1 | < 0.1%
0.05 | 1 | < 0.1%
0.06 | 2 | < 0.1%
0.07 | 8 | 0.1%
0.08 | 4 | 0.1%
0.09 | 11 | 0.2%
0.1 | 19 | 0.3%
0.11 | 10 | 0.2%
0.12 | 7 | 0.1%
0.13 | 11 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
6.62 | 1 | < 0.1%
5.9 | 1 | < 0.1%
5.7 | 1 | < 0.1%
5.47 | 1 | < 0.1%
4.52 | 1 | < 0.1%
4.43 | 1 | < 0.1%
4.21 | 1 | < 0.1%
4.07 | 1 | < 0.1%
4 | 1 | < 0.1%
3.99 | 1 | < 0.1%

DESP_TOT_RSU
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 4333
Distinct (%): 77.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5208595.218
Minimum: 12500
Maximum: 2348522611
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 43.6 KiB

Quantile statistics

Minimum: 12500
5-th percentile: 125553.309
Q1: 390115.2425
Median: 1105360
Q3: 5208595.22
95-th percentile: 10403362.43
Maximum: 2348522611
Range: 2348510111
Interquartile range (IQR): 4818479.978

Descriptive statistics

Standard deviation: 46800726.11
Coefficient of variation (CV): 8.985287617
Kurtosis: 1963.818127
Mean: 5208595.218
Median Absolute Deviation (MAD): 892147.13
Skewness: 41.42497498
Sum: 2.901187536 × 10^10
Variance: 2.190307965 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
5208595.22 | 981 | 17.6%
300000 | 16 | 0.3%
500000 | 10 | 0.2%
100000 | 9 | 0.2%
250000 | 9 | 0.2%
120000 | 9 | 0.2%
200000 | 9 | 0.2%
1000000 | 8 | 0.1%
600000 | 7 | 0.1%
350000 | 7 | 0.1%
Other values (4323) | 4505 | 80.9%

Minimum 10 values

Value | Count | Frequency (%)
12500 | 1 | < 0.1%
13600 | 1 | < 0.1%
14400 | 1 | < 0.1%
15000 | 1 | < 0.1%
15672 | 1 | < 0.1%
18066 | 1 | < 0.1%
18822.83 | 1 | < 0.1%
19911.38 | 1 | < 0.1%
22000 | 1 | < 0.1%
22800 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
2348522611 | 1 | < 0.1%
2174193880 | 1 | < 0.1%
482562163.3 | 1 | < 0.1%
446998774.6 | 1 | < 0.1%
408882336 | 1 | < 0.1%
389926684.6 | 1 | < 0.1%
380258578.7 | 1 | < 0.1%
335564172 | 1 | < 0.1%
309733213 | 1 | < 0.1%
299889089.6 | 1 | < 0.1%

Interactions

[Figure: pairwise interaction plots of the 9 numeric variables (81 panels); images not preserved.]

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
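
The calculation just described can be sketched in pure Python: rank both variables (ties receive the mean of their positions), then apply the Pearson formula to the ranks. The function names are hypothetical, and the report itself presumably relies on a library implementation rather than this code.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance over the product of standard deviations (1/n factors cancel)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]   # monotonic but nonlinear relationship
rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
print(rho, r)
```

On this monotonic but nonlinear example ρ reaches 1 while r stays below 1, which illustrates why the two measures can flag different variable pairs.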

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
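
A minimal sketch of that formula, together with a check of the invariance under changes of location and scale mentioned above. The data are illustrative and `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # The 1/n factors of the covariance and both deviations cancel out.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r = pearson_r(xs, ys)

# Shifting and rescaling one variable (positive scale) leaves r unchanged.
r_rescaled = pearson_r([10 * x + 7 for x in xs], ys)
print(r, r_rescaled)
```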

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
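
The pair-counting rule above corresponds to the variant usually called τ-a. The sketch below implements exactly that rule. Library implementations often use τ-b, which adjusts the denominator for ties, so results can differ on tied data. `kendall_tau_a` is a hypothetical helper name.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs; ties count as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
    n = len(xs)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]   # one swapped pair out of six
tau = kendall_tau_a(xs, ys)
print(tau)
```

The quadratic loop over all pairs is fine for a short illustration, but for thousands of observations an O(n log n) library routine is the more practical choice.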

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
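
Both displays summarize which cells are missing. A minimal sketch of the underlying count, using plain dictionaries and hypothetical rows (the dataset in this report has no missing cells, so a gap is introduced here purely for illustration):

```python
def missing_counts(rows, columns):
    """Number of missing (None or absent) cells per column."""
    return {
        col: sum(1 for row in rows if row.get(col) is None)
        for col in columns
    }


# Hypothetical rows: the second one lacks an IDHM value.
rows = [
    {"MUNICIPIO": "A", "POP_EST": 1000, "IDHM": 0.70},
    {"MUNICIPIO": "B", "POP_EST": 2500, "IDHM": None},
    {"MUNICIPIO": "C", "POP_EST": 4000, "IDHM": 0.65},
]
columns = ["MUNICIPIO", "POP_EST", "IDHM"]
nullity = missing_counts(rows, columns)
print(nullity)
```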

Sample

First rows

Index | MUNICIPIO | POP_EST | DENS_DEM | IDHM | PIBCAP | NRO_EMP | MASSA_PCAP | CUSTO_UNIM | MASSA_PCAP_POP | DESP_TOT_RSU
0 | Abadia de Goiás | 9158 | 46.850 | 0.708 | 26505.890 | 3.000 | 1.450 | 170.270 | 0.882 | 1188166.840
1 | Abadia dos Dourados | 7022 | 7.610 | 0.689 | 18353.480 | 1.000 | 1.016 | 223.671 | 0.882 | 5208595.220
2 | Abadiânia | 20873 | 15.080 | 0.689 | 16132.950 | 3.000 | 1.016 | 223.671 | 0.882 | 5208595.220
3 | Abaeté | 23263 | 12.490 | 0.698 | 21286.430 | 1.000 | 1.120 | 223.671 | 0.990 | 279000.000
4 | Abaetetuba | 160439 | 87.610 | 0.628 | 9046.130 | 1.000 | 0.910 | 136.400 | 0.920 | 5613966.000
5 | Abaiara | 11965 | 58.690 | 0.628 | 7360.500 | 0.000 | 1.016 | 223.671 | 0.882 | 5208595.220
6 | Abaíra | 8681 | 15.680 | 0.603 | 6794.210 | 0.000 | 1.016 | 223.671 | 0.882 | 5208595.220
7 | Abaré | 20594 | 11.490 | 0.575 | 6957.040 | 0.000 | 1.016 | 223.671 | 0.882 | 5208595.220
8 | Abatiá | 7360 | 33.950 | 0.687 | 21529.760 | 0.000 | 1.520 | 223.950 | 1.150 | 897893.520
9 | Abdon Batista | 2534 | 11.250 | 0.694 | 24517.730 | 0.000 | 0.920 | 223.671 | 0.500 | 131142.920

Last rows

Index | MUNICIPIO | POP_EST | DENS_DEM | IDHM | PIBCAP | NRO_EMP | MASSA_PCAP | CUSTO_UNIM | MASSA_PCAP_POP | DESP_TOT_RSU
5560 | Xapuri | 19866 | 3.010 | 0.599 | 12553.210 | 0.000 | 1.016 | 223.671 | 0.882 | 5208595.220
5561 | Xavantina | 3873 | 19.120 | 0.749 | 54550.940 | 0.000 | 1.750 | 458.330 | 1.750 | 421521.640
5562 | Xaxim | 29254 | 87.670 | 0.752 | 33345.930 | 3.000 | 0.930 | 279.160 | 0.930 | 2613963.770
5563 | Xexéu | 14789 | 127.180 | 0.552 | 8422.310 | 1.000 | 0.750 | 223.671 | 0.490 | 775099.500
5564 | Xinguara | 45416 | 10.740 | 0.646 | 27618.270 | 1.000 | 0.780 | 223.671 | 0.750 | 7322501.490
5565 | Xique-Xique | 46562 | 8.280 | 0.585 | 8442.060 | 0.000 | 1.016 | 223.671 | 0.882 | 5208595.220
5566 | Zabelê | 2269 | 18.970 | 0.623 | 10724.450 | 0.000 | 1.330 | 223.671 | 1.330 | 756423.910
5567 | Zacarias | 2784 | 7.320 | 0.729 | 32979.590 | 0.000 | 0.820 | 223.671 | 0.820 | 107400.000
5568 | Zé Doca | 52190 | 20.770 | 0.595 | 8429.400 | 1.000 | 1.340 | 58.090 | 1.320 | 2131452.000
5569 | Zortéa | 3432 | 15.770 | 0.761 | 21389.690 | 0.000 | 0.990 | 447.100 | 0.990 | 454051.120